Dynamic Time Warping-based multiple imputation
Or
A series of desperate pleadings with the time series gods

Danielle McCool

12/6/2021

Getcha up to speed

Me (Danielle)

UU and CBS

Last year of my PhD

The pitch

Mobility surveys

What if we used sensors?

A mobile phone app!

Lessons

Participants participate

More (short) trips with the app

Short gaps? Interpolate!

Everybody hates long gaps

Travel behavior as we’d like it

  • Time series of coordinates
  • Very precise
  • Very accurate
  • Very useful

Travel as we often get it

  • Incomplete time series of coordinates
  • Very precise
  • Unknown accuracy
  • Usefulness?

Guessing smart on missing data

  • Take a hint from other disciplines
  • Multiple imputation

How to understand multiple imputation

The plan

  1. Imputing someone’s exact location

The plan

  1. Imputing aggregate characteristics
    • Distance
    • Mode of transportation
    • Radius of gyration
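
Of the three characteristics, radius of gyration is the least self-explanatory, so here is a minimal sketch of how it could be computed. It assumes the coordinates have already been projected to a planar system in meters and that every point gets equal weight; the function name and the example points are hypothetical, not taken from the study.

```python
import math


def radius_of_gyration(points):
    """Root-mean-square distance of the points from their centroid.

    `points` is a list of (x, y) pairs in a planar projection (meters).
    Equal weighting of points is an assumption of this sketch.
    """
    n = len(points)
    if n == 0:
        raise ValueError("need at least one point")
    # Centroid (center of mass) of the visited locations.
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Mean squared distance from the centroid, then the square root.
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    return math.sqrt(mean_sq)


# Four corners of a 2 m square: every corner is sqrt(2) m from the center.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
rg = radius_of_gyration(square)
```

With raw latitude/longitude the distances would need a great-circle formula or a projection step first.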

The plan pt 2

  1. Imputing aggregate characteristics
  2. Incorporating time as an element

The time series data

Distance travelled per hour (km), by user
user  8h   9h   10h  11h  12h  13h  14h  15h  16h  17h  18h  19h
id1   0.0  0.3  0.6  0.1  0.9  0.5  0.0  0.0  3.9  3.0  1.0  0.0
id2   0.0  0.1  0.5  0.0  1.2  1.0  0.5  0.0  4.1  2.3  1.9  0.7
id3   0.0  0.0  0.6  3.7  0.0  2.9  0.0  2.5  0.3  0.0  0.0  0.0
id4   7.0  0.0  0.0  0.0  0.0  0.0  0.0  1.1  0.0  1.3  1.1  4.5
id5   1.6  0.1  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  7.9  0.0
id6   1.5  4.2  0.0  0.4  0.0  0.0  0.1  0.0  2.1  0.0  7.1  4.8
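
As a rough illustration, the hourly values in the table can be turned into a cumulative series with a running sum. The sketch below uses the first two rows of the table; the variable names are hypothetical and it assumes no hours are missing.

```python
from itertools import accumulate

# Hourly distance in km for 8h..19h, copied from the first two table rows.
hourly_km = {
    "id1": [0.0, 0.3, 0.6, 0.1, 0.9, 0.5, 0.0, 0.0, 3.9, 3.0, 1.0, 0.0],
    "id2": [0.0, 0.1, 0.5, 0.0, 1.2, 1.0, 0.5, 0.0, 4.1, 2.3, 1.9, 0.7],
}

# Running total per user: each entry is the distance covered so far that day.
cumulative_km = {
    user: list(accumulate(values)) for user, values in hourly_km.items()
}

# The last entry of each cumulative series is the daily total.
daily_total_km = {user: series[-1] for user, series in cumulative_km.items()}
```

A cumulative series never decreases, which makes the overall shape of a day easier to compare than the spiky hourly values.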

To cumulative or not to cumulative

Distance of first 5 24-hour days.

Cum. distance in meters of first 5 24-hour days.

Deepdive

Cum. distance in meters of first 5 24-hour days.

We’ve fixed the missing data problem

Cum. distance in meters of first 2 24-hour days.

Oh, right

Cum. distance in meters of two days from one person.

But maybe there’s something to this

Cum. distance in meters of two days from two different people.

Dynamic time warping

  • A method of aligning time series
  • Can handle shifts in time
  • Provides us with a distance measure
  • We can select the best candidates for imputation!
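
To make the alignment idea concrete, here is a minimal sketch of the textbook dynamic-programming form of DTW. It assumes absolute difference as the local cost and no window constraint; the talk does not say which variant or library was used, so treat the function below as an illustration only.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW cost between two numeric sequences."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both series must be non-empty")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a, step in b, or step in both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# The same trip taken one hour later: identical shape, shifted in time.
early = [0.0, 4.0, 0.0, 0.0]
late = [0.0, 0.0, 4.0, 0.0]

pointwise = sum(abs(x - y) for x, y in zip(early, late))  # 8.0
warped = dtw_distance(early, late)  # 0.0
```

The pointwise comparison penalizes the shift twice, while DTW absorbs it entirely, which is why it suits travel that happens at slightly different times of day.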

Use in multiple imputation

Different distance weights provide different candidates

Another one

Different distance weights provide different candidates
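
One way candidate selection could feed multiple imputation is sketched below. Everything here is an assumption for illustration: the recipient is missing only the end of the day, donors are complete, candidates are ranked by DTW distance on the observed hours, and donors are drawn with inverse-distance weights. The function names are hypothetical, and the weighting is not necessarily the one used in the talk. The example data are rows of the hourly table.

```python
import math
import random


def dtw_distance(a, b):
    """Textbook DTW cost with absolute difference as the local cost."""
    cost = [[math.inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[len(a)][len(b)]


def nearest_donors(observed, donors, k):
    """Return the k (distance, name) pairs closest to the observed hours."""
    n_obs = len(observed)
    scored = [
        (dtw_distance(observed, series[:n_obs]), name)
        for name, series in donors.items()
    ]
    return sorted(scored)[:k]


def impute_tail(observed, donors, k=2, m=5, seed=0):
    """Create m completed series by borrowing the tail of a sampled donor."""
    rng = random.Random(seed)
    candidates = nearest_donors(observed, donors, k)
    names = [name for _, name in candidates]
    # Inverse-distance weights (an assumption): closer donors are drawn more.
    weights = [1.0 / (dist + 1e-9) for dist, _ in candidates]
    draws = rng.choices(names, weights=weights, k=m)
    n_obs = len(observed)
    return [list(observed) + list(donors[name][n_obs:]) for name in draws]


# id1 observed from 8h to 13h only; id2..id4 act as complete donors.
observed = [0.0, 0.3, 0.6, 0.1, 0.9, 0.5]
donors = {
    "id2": [0.0, 0.1, 0.5, 0.0, 1.2, 1.0, 0.5, 0.0, 4.1, 2.3, 1.9, 0.7],
    "id3": [0.0, 0.0, 0.6, 3.7, 0.0, 2.9, 0.0, 2.5, 0.3, 0.0, 0.0, 0.0],
    "id4": [7.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.1, 0.0, 1.3, 1.1, 4.5],
}

ranked = nearest_donors(observed, donors, k=2)
imputations = impute_tail(observed, donors, k=2, m=5, seed=0)
daily_totals = [sum(series) for series in imputations]
```

Each completed series yields its own daily total, and the spread across the m totals reflects the uncertainty about the missing hours.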

Next steps

Middle missingness

Radius of Gyration

Allowing time shifts

Evaluation criteria

In conclusion

  1. Long gaps require more complex solutions
  2. We can turn trajectory metrics into time series
  3. These time series have similarities within and between people
  4. Dynamic time warping allows us to align on time series similarities
  5. We can use multiple similar candidates to estimate the metrics for the missing time periods

Image attributions